home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
ftp.cs.arizona.edu
/
ftp.cs.arizona.edu.tar
/
ftp.cs.arizona.edu
/
icon
/
newsgrp
/
group97b.txt
/
000021_icon-group-sender _Thu Jul 10 14:19:02 1997.msg
< prev
next >
Wrap
Internet Message Format
|
2000-09-20
|
6KB
Received: from kingfisher.CS.Arizona.EDU by cheltenham.cs.arizona.edu; Thu, 10 Jul 1997 16:53:26 MST
Received: by kingfisher.CS.Arizona.EDU; (5.65v3.2/1.1.8.2/08Nov94-0446PM)
id AA27195; Thu, 10 Jul 1997 16:53:25 -0700
Message-Id: <199707102119.OAA00541@orpheus.gemini.edu>
From: swampler@noao.edu (Steve Wampler)
Date: Thu, 10 Jul 1997 14:19:02 MST
X-Mailer: Mail User's Shell (7.2.3 5/22/91)
To: icon-group@cs.arizona.edu
Subject: Results of 'A small puzzle'
Errors-To: icon-group-errors@cs.arizona.edu
Status: RO
The puzzle was to write a procedure lcp(s1,s2) that would return
the longest common prefix of strings s1 and s2.
I won't repost the solutions that people have already sent to this
group. Instead, I'll give my own solutions and then show some timings.
--------- my solutions --------
procedure lcp0(s1,s2)
return s2 ? =s1[1+:*s1 to 0 by -1]
end
procedure lcp1(s1,s2)
return s2 ? =s1[1+:*(if *s1 > *s2 then s2 else s1) to 0 by -1]
end
procedure lcp2(s1,s2)
return s1[1+:(p := *s1 to 0 by -1)] == s2[1+:p]
end
procedure lcp3(s1,s2)
return s1[1+:(p := (many(s1**s2,s1)|0) to 0 by -1)] == s2[1+:p]
end
-------------------------------
In this list, the even-numbered procedures are 'simple', the odd
numbered try out a couple of heuristics.
Now for some timings:
As you might imagine, the speed of each algorithm is
extremely dependent on the nature of the input. Factors that
influence the speed depend include (a) the sizes of the two strings,
(b) the length of the longest common prefix, (c) the order of
the two strings given as arguments. I didn't run any tests with
*really* long strings - the linear solutions would have done
much better on long strings with short lcp's.
There is no clearly superior solution. I think the conclusion is
that if you know the nature of the strings you are dealing with,
you can pick an algorithm that does a good job with it!
I've wondered if one of the fast substring find algorithms that
are available could be adopted to the lcp() problem, but didn't
feel like trying to code one up, as I doubt they would
help as much for prefix substrings as they do in the
general case.
Also, you could produce a solution that *looks* really good
according to my test program by doing something like:
# warning untested code follows!
procedure lcp(s1,s2)
local k
static lcpTab
initial lcpTab := table()
k := s1 || ":" || s2
/lcpTab[k] := real_lcp(s1,s2)
return lcpTab[k]
end
But that would be *cheating* and potentially very costly to
use in practice...though there is a simpler cheat that
at least isn't so potentially costly, even if it isn't likely
to help in the Real World(tm).
--------------- Timing comparisons ---------------------------
The two strings are shown at the start of each time set, separated
by a :.
Results are ordered from fastest to slowest for each time set.
Time is the total time for 50,000 calls to the routine, minus
the loop overhead itself.
Multiple solutions from the same person are numbered in the
same order that they appeared in their post. (I.e. bob1()
is the first solution by Bob, bob2() is the 2nd., etc.)
: { two null strings }
carl........... 3.633
bob2........... 4.183
bob1........... 4.183
steve1......... 4.249
lcp0........... 5.750
lcp2........... 5.766
lcp1........... 6.683
jerry.......... 6.733
steve2......... 6.950
steve3......... 7.033
lcp3........... 8.000
fool:foodchain
lcp0........... 7.550
lcp1........... 8.384
lcp2........... 9.167
carl........... 10.084
lcp3........... 12.100
bob2........... 13.033
bob1........... 13.734
steve1......... 14.467
jerry.......... 16.334
steve2......... 18.917
steve3......... 26.467
foodchain:fool
lcp1........... 8.517
jerry.......... 8.550
carl........... 10.084
lcp3........... 12.084
bob2........... 12.900
bob1........... 13.700
steve1......... 14.534
lcp0........... 15.133
steve2......... 20.284
lcp2........... 21.117
steve3......... 28.717
aaaaaaaaaaaaaaab:aaaaaaaaaaaaaaac
lcp0........... 8.384
lcp1........... 9.301
jerry.......... 9.850
lcp2........... 10.084
lcp3........... 14.567
carl........... 29.268
steve2......... 30.251
steve3......... 40.051
bob2........... 44.850
bob1........... 46.218
steve1......... 49.634
aardvarkseatsmallants:smallantsareeatenbylargeardvarks
carl........... 5.283
bob2........... 5.451
bob1........... 5.650
steve1......... 5.684
steve3......... 9.234
steve2......... 10.184
lcp0........... 38.350
lcp1........... 39.217
jerry.......... 57.800
lcp2........... 71.933
lcp3........... 78.284
smallantsareeatenbylargeardvarks:aardvarkseatsmallants
carl........... 5.284
bob2........... 5.468
steve1......... 5.684
bob1........... 5.684
steve3......... 9.217
steve2......... 10.234
lcp1........... 39.284
jerry.......... 40.617
lcp0........... 54.868
lcp3........... 68.851
lcp2........... 98.450
thisistheexactsamestringastheotherone:thisistheexactsamestringastheotherone
lcp0........... 7.034
lcp2........... 7.151
lcp1........... 7.967
jerry.......... 8.101
lcp3........... 14.651
steve2......... 50.118
carl........... 62.750
steve3......... 77.267
bob2........... 96.051
bob1........... 103.884
steve1......... 112.167
-----------------------------------------------------------
I think I had about as much fun working on the timing test program
as I did on the puzzle itself! If anyone would like to see the
test program, let me know and I'll post it. (Perhaps people would
like to improve upon it...)
--
Steve Wampler - swampler@gemini.edu [Gemini 8m Telescopes Project (under AURA)]
O Sibile, si ergo, fortibus es inero.
Nobile, demis trux. Demis phulla causan dux.